
(e.g., NCBI genome) for the organism, and use the “-r” and “-g” options as follows (you may
need to decompress the GFF file) (Figures 3.11 and 3.12):

quast.py \
    -o ref_quast_Ecoli_ass \
    -t 4 \
    -r ecolref/GCF_000005845.2_ASM584v2_genomic.fna.gz \
    -g ecolref/GCF_000005845.2_ASM584v2_genomic.gff \
    abyss_ecoli_ass.fasta \
    spades_ecoli_ass.fasta \
    spades_hyb_ecoli_ass.fasta
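If the annotation was downloaded in compressed form, it can be decompressed before running the QUAST command. The following is a minimal sketch: the ".gff.gz" file name and the helper function decompress_keep are assumptions made for illustration and are not part of QUAST. Adjust the path to match the file you actually downloaded.

```shell
# Assumed path: the GFF annotation as a gzip-compressed download (hypothetical name)
gff_gz="ecolref/GCF_000005845.2_ASM584v2_genomic.gff.gz"

# Hypothetical helper: write a decompressed copy next to the .gz file,
# leaving the compressed original in place
decompress_keep() {
    # $1: path to a .gz file; output path is the same name without ".gz"
    gunzip -c "$1" > "${1%.gz}"
}

# Only decompress if the compressed annotation is actually present
if [ -f "$gff_gz" ]; then
    decompress_keep "$gff_gz"
fi
```

Writing to a new file with "gunzip -c" keeps the compressed copy, so the download does not have to be repeated if the decompressed file is later removed.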

3.3.2  Evolutionary Assessment for De Novo Genome Assembly

Rather than relying on contig or scaffold length statistics such as N50 and L50, the
evolutionary assessment of a genome assembly is based on the completeness of the genome
relative to an evolutionarily informed expectation of gene content, inferred from
orthologous groups of sequences in closely related organisms. In other words, it measures
how complete a genome assembly is in terms of the genes it contains.

BUSCO (Benchmarking Universal Single-Copy Orthologs) [12, 13] is an evolution-based
quality assessment program that uses information about known genes from a database

FIGURE 3.11  QUAST assembly assessment report (reference-guided assessment).